To determine the universality class of critical phenomena, we propose a method of statistical 
inference in the scaling analysis of critical phenomena. The method is based on Bayesian statistics, 
more specifically, Gaussian process regression. It assumes only the smoothness of a scaling 
function and does not require its functional form. We demonstrate this method for the finite-size scaling 
analysis of the Ising models on square and triangular lattices. Near the critical point, the method 
is comparable in accuracy to the least-square method. In addition, it works well for data to which 
we cannot apply the least-square method with a polynomial of low degree. By comparing the data 
on triangular lattices with the scaling function inferred from the data on square lattices, we confirm 
the universality of the finite-size scaling function of the two-dimensional Ising model. 
I. INTRODUCTION 

A wide variety of systems exhibit critical phenomena. 
Near a critical point, some quantities obey scaling laws. 
As an example, consider

$$A(t,h) = t^{x}\,\Psi(h t^{-y}), \tag{1}$$

where $t$ and $h$ are variables describing a system, and the critical point is located at $t = h = 0$. The scaling law is derived by the renormalization group argument [1]. The scaling exponents $x$ and $y$ are called critical exponents. The universality of critical phenomena means that different systems share the same set of critical exponents. Thus, this set defines a universality class of critical phenomena. In addition, the scaling function $\Psi$ also exhibits universality. For example, Mangazeev et al. numerically obtained scaling functions of the Ising models on square and triangular lattices [2]. Since Ising models on both lattices belong to the same universality class, the two scaling functions are equal once nonuniversal metric factors are taken into account.

An important issue in the study of critical phenomena is to determine the universality class. The object of scaling analysis is to determine the universality class from data. We assume the scaling law of Eq. (1) for data. If we plot data with rescaled coordinates as $(X_i, Y_i) = (h_i t_i^{-y}, t_i^{-x} A(t_i, h_i))$, all points must collapse on a scaling function as $Y_i = \Psi(X_i)$. To determine critical exponents, we need a mathematical method to estimate how well all rescaled points collapse on a function for a given set of exponents. In other words, we need to estimate the goodness of data collapse. Unfortunately, we do not know the form of $\Psi$ a priori. The conventional method for the scaling analysis is a least-square method that assumes a polynomial form. However, it may be difficult to choose the degree of the polynomial for data, because there are overfitting problems associated with increasing the degree. To use a polynomial of low degree, we usually limit the data to a narrow region near a critical point. However, this may require data of high accuracy. In addition, it may be difficult to obtain a universal scaling function in a wide critical region. Thus, the scaling analysis by the least-square method must be carefully done, as shown in Ref. [3].
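The rescaling step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the scaling function `psi`, the exponents, and the grid of $(t, h)$ values are hypothetical, and in a real analysis the scaling function is unknown, which is exactly why a statistical measure of the collapse is needed.

```python
def rescale(t, h, a, x, y):
    """Map one data point (t, h, A) to the collapse coordinates (X, Y) of Eq. (1)."""
    return h * t ** (-y), a * t ** (-x)

def psi(u):
    # Hypothetical smooth scaling function, used only to generate mock data.
    return 1.0 / (1.0 + u * u)

# Hypothetical exponents and noise-free mock data obeying A = t^x psi(h t^-y).
x_true, y_true = 0.5, 1.5
data = [(t, h, t ** x_true * psi(h * t ** (-y_true)))
        for t in (0.1, 0.2, 0.4) for h in (0.01, 0.02, 0.05)]

def max_collapse_error(x, y):
    """Largest deviation |Y_i - psi(X_i)| after rescaling with trial exponents."""
    points = (rescale(t, h, a, x, y) for t, h, a in data)
    return max(abs(Y - psi(X)) for X, Y in points)

print(max_collapse_error(x_true, y_true))  # rounding-level: the points collapse
print(max_collapse_error(0.4, 1.2))        # visibly nonzero: no collapse
```

With the correct exponents the rescaled points fall on a single curve, while wrong exponents scatter them. The check against `psi` is possible here only because the mock data were generated from it.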

In this paper, we propose a method of statistical inference in the scaling analysis of critical phenomena. The method is based on Bayesian statistics. Bayesian statistics has been widely used for data analysis [4]. However, to the best of our knowledge, it has not been applied to the scaling analysis of critical phenomena. In particular, since our method assumes only the smoothness of a scaling function, it can be applied to data for which the least-square method cannot be used.

In Sec. II, we first introduce a Bayesian framework in the scaling analysis of critical phenomena. Next, we propose a Bayesian inference using a Gaussian process (GP) in this framework. In Sec. III, we demonstrate this method for critical phenomena of the Ising models on square and triangular lattices. Finally, we give the conclusions in Sec. IV.



II. BAYESIAN FRAMEWORK AND BAYESIAN
INFERENCE IN SCALING ANALYSIS



By using two functions $X$ and $Y$ that calculate rescaled coordinates, the scaling law of an observable $A$ can be rewritten as

$$Y(A(v), v, \theta_p) = \Psi(X(v, \theta_p)), \tag{2}$$

where $v$ denotes the variables describing a system and $\theta_p$ denotes the additional parameters such as critical exponents. Our purpose is to infer $\theta_p$ so that data $A(v_i)$ $(1 \le i \le N)$ obey the scaling law of Eq. (2). In the following, for convenience, we abbreviate $X(v_i, \theta_p)$ and $Y(A(v_i), v_i, \theta_p)$ to $X_i$ and $Y_i$, respectively.

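When the statistical error of $Y_i$ is $E_i$, the distribution function of $\{Y_i\}$, $P(\vec{Y}|\vec{\Psi}, \theta_p)$, is a multivariate Gaussian distribution with mean vector $\vec{\Psi}$ and covariance matrix $\mathcal{E}$:

$$P(\vec{Y}|\vec{\Psi}, \theta_p) = \mathcal{N}(\vec{Y}|\vec{\Psi}, \mathcal{E}), \tag{3}$$

where $(\vec{Y})_i = Y_i$, $(\vec{\Psi})_i = \Psi(X_i)$, $(\mathcal{E})_{ij} = E_i^2 \delta_{ij}$, and $\mathcal{N}(\vec{Y}|\vec{\mu}, \Sigma)$ denotes a multivariate Gaussian distribution with mean vector $\vec{\mu}$ and covariance matrix $\Sigma$.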



Next, we introduce a statistical model for a scaling function as $P(\vec{\Psi}|\theta_h)$. Here, $\theta_h$ denotes the control parameters, which are referred to as hyper parameters. Then, the conditional probability of $\vec{Y}$ for $\theta_p$ and $\theta_h$ is formally defined as

$$P(\vec{Y}|\theta_p, \theta_h) = \int P(\vec{Y}|\vec{\Psi}, \theta_p)\, P(\vec{\Psi}|\theta_h)\, d\vec{\Psi}. \tag{4}$$



According to Bayes' theorem, a conditional probability of $\theta_p$ and $\theta_h$ for $\vec{Y}$ can be written as

$$P(\theta_p, \theta_h|\vec{Y}) = P(\vec{Y}|\theta_p, \theta_h)\, P(\theta_p, \theta_h) / P(\vec{Y}), \tag{5}$$

where $P(\theta_p, \theta_h)$ and $P(\vec{Y})$ denote the prior distributions of $\theta_p$ and $\theta_h$ and that of $\vec{Y}$, respectively. In Bayesian statistics, $P(\theta_p, \theta_h|\vec{Y})$ is called a posterior distribution of $\theta_p$ and $\theta_h$. Using Eq. (5), a posterior probability of $\theta_p$ and $\theta_h$ for $\vec{Y}$ can be estimated. This is a Bayesian framework for the scaling analysis of critical phenomena.

In Bayesian statistics, the conventional method of inferring parameters is the maximum a posteriori (MAP) estimate. In this paper, for simplicity, we assume that all prior distributions are uniform. Then,

$$P(\theta_p, \theta_h|\vec{Y}) \propto P(\vec{Y}|\theta_p, \theta_h). \tag{6}$$

Therefore, the MAP estimate is equal to a maximum likelihood (ML) estimate with a likelihood function of $\theta_p$ and $\theta_h$, defined as

$$\mathcal{L}(\theta_p, \theta_h) \equiv P(\vec{Y}|\theta_p, \theta_h). \tag{7}$$



In addition, the confidence intervals of the parameters can be estimated through Eq. (5).

In this framework, the statistical model of a scaling function plays an important role. We start from a polynomial scaling function as $\Psi(X) = \sum_k c_k X^k$. If a coefficient $c_k$ is distributed by a probability density $P(c_k|\theta_h)$, then $P(\vec{\Psi}|\theta_h)\,d\vec{\Psi} = \prod_k P(c_k|\theta_h)\,dc_k$. We first consider the strong constraint for $c_k$ as $P(c_k|\theta_h) = \delta(c_k - m_k)$, where $m_k$ is a hyper parameter. Then, $P(\vec{Y}|\theta_p, \theta_h)$ is a multivariate Gaussian distribution with mean vector $\vec{\mu}$ and covariance matrix $\Sigma$:

$$(\vec{\mu})_i = \sum_k m_k X_i^k, \quad \Sigma = \mathcal{E}. \tag{8}$$

Thus, the ML estimate in Eq. (7) is equal to the least-square method. We soften this constraint as $P(c_k|\theta_h) = \mathcal{N}(c_k|m_k, \sigma_k^2)$, where $m_k$ and $\sigma_k$ are hyper parameters. Then, $P(\vec{Y}|\theta_p, \theta_h)$ is again a multivariate Gaussian distribution, and the covariance matrix changes as follows:

$$\Sigma = \mathcal{E} + \Sigma', \quad (\Sigma')_{ij} = \sum_k \sigma_k^2 X_i^k X_j^k. \tag{9}$$

This includes the case of a strong constraint such as $\sigma_k = 0$.
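The covariance matrix of Eq. (9) follows from a short calculation. The sketch below assumes, as the factorized prior suggests, that the coefficients $c_k$ are mutually independent and independent of the measurement noise; the noise symbol $\epsilon_i$ is introduced here only for illustration.

```latex
\begin{align}
Y_i &= \sum_k c_k X_i^k + \epsilon_i, \qquad
  c_k \sim \mathcal{N}(m_k, \sigma_k^2), \quad
  \epsilon_i \sim \mathcal{N}(0, E_i^2), \\
\langle Y_i \rangle &= \sum_k m_k X_i^k = (\vec{\mu})_i, \\
\langle (Y_i - \mu_i)(Y_j - \mu_j) \rangle
  &= \sum_k \sigma_k^2 X_i^k X_j^k + E_i^2 \delta_{ij}
   = (\Sigma')_{ij} + (\mathcal{E})_{ij}.
\end{align}
```

Since $\vec{Y}$ is a linear combination of Gaussian variables, it is again Gaussian, and setting every $\sigma_k = 0$ recovers Eq. (8).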

To calculate a MAP estimate, a log-likelihood function is used. If a posterior distribution is described by a multivariate Gaussian function as $P(\theta_p, \theta_h|\vec{Y}) \propto \mathcal{N}(\vec{Y}|\vec{\mu}, \Sigma)$, the log-likelihood function can be written as

$$\log \mathcal{L}(\theta_p, \theta_h) = -\frac{1}{2}\log|2\pi\Sigma| - \frac{1}{2}(\vec{Y} - \vec{\mu})^{T}\Sigma^{-1}(\vec{Y} - \vec{\mu}). \tag{10}$$

Although the likelihood function is nonlinear in parameters $\theta_p$ and $\theta_h$, a multidimensional maximization method may be applied to calculate a MAP estimate. Under a strong constraint such as $\sigma_k = 0$, the Levenberg-Marquardt algorithm is efficient. Under a weak constraint such as $\sigma_k > 0$, we may use an efficient maximization algorithm such as the Fletcher-Reeves conjugate gradient algorithm. In such efficient algorithms, we sometimes need the derivative of Eq. (10) for a parameter $\theta$. Then, we can use the following formula:

$$\frac{\partial \log \mathcal{L}(\theta_p, \theta_h)}{\partial\theta} = -\frac{1}{2}\mathrm{Tr}\left(\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta}\right) - (\vec{Y} - \vec{\mu})^{T}\Sigma^{-1}\frac{\partial(\vec{Y} - \vec{\mu})}{\partial\theta} + \frac{1}{2}(\vec{Y} - \vec{\mu})^{T}\Sigma^{-1}\frac{\partial\Sigma}{\partial\theta}\Sigma^{-1}(\vec{Y} - \vec{\mu}). \tag{11}$$

However, to compute the inverse of a covariance matrix, the computational cost of an iteration is $O(N^3)$, whereas it is $O(N^2)$ for the least-square method. Fortunately, using a high-performance numerical library for linear algebra, we can easily write a code that runs efficiently for several hundred data points. Another method is based on Monte Carlo (MC) samplings. In particular, MC samplings may be useful for the estimate of the confidence intervals of parameters.
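As a rough illustration of Eqs. (9)-(11), the log-likelihood and its derivative can be evaluated with NumPy as sketched below. The function names, the grid of temperatures and sizes, and the unit variance of the noise are assumptions made for this example, not details taken from the analysis above. The mock data follow a linear scaling function with $T_c = \beta/\nu = 1$ and $1/\nu = 2$, and the mean vector is set to zero.

```python
import numpy as np

def poly_covariance(X, E, sigma):
    """Sigma of Eq. (9): E_i^2 on the diagonal plus sum_k sigma_k^2 X_i^k X_j^k."""
    powers = np.vander(X, len(sigma), increasing=True)   # column k holds X_i^k
    return np.diag(E ** 2) + (powers * sigma ** 2) @ powers.T

def log_likelihood(r, Sigma):
    """Eq. (10) for the residual r = Y - mu, via a Cholesky factor (O(N^3))."""
    chol = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(chol, r)                  # z @ z = r^T Sigma^{-1} r
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))  # log|Sigma|
    return -0.5 * (len(r) * np.log(2.0 * np.pi) + log_det) - 0.5 * (z @ z)

def grad_log_likelihood(r, Sigma, dr, dSigma):
    """Eq. (11) for one parameter, given the derivatives dr and dSigma."""
    sinv_r = np.linalg.solve(Sigma, r)
    return (-0.5 * np.trace(np.linalg.solve(Sigma, dSigma))
            - sinv_r @ dr
            + 0.5 * sinv_r @ dSigma @ sinv_r)

# Mock data: A = 2/L + (T - 1) L + noise/50 (assumed grid and unit-variance noise).
rng = np.random.default_rng(0)
T, L = np.meshgrid(np.linspace(0.9, 1.1, 7), [4.0, 8.0, 16.0])
T, L = T.ravel(), L.ravel()
A = 2.0 / L + (T - 1.0) * L + rng.normal(size=T.size) / 50.0

def objective(Tc, beta_nu, inv_nu, sigma=np.array([2.0, 2.0])):
    """Log-likelihood of the rescaled data for trial parameters (zero mean vector)."""
    X = (T - Tc) * L ** inv_nu
    Y = A * L ** beta_nu
    E = L ** beta_nu / 50.0          # the error of A propagated to Y
    return log_likelihood(Y, poly_covariance(X, E, sigma))

ll_true = objective(1.0, 1.0, 2.0)
ll_off = objective(1.1, 1.0, 2.0)
print(ll_true > ll_off)   # the correct T_c gives the larger likelihood
```

In practice `objective` would be handed to a maximization routine, with `grad_log_likelihood` supplying the derivatives. The Cholesky factorization is the $O(N^3)$ step mentioned above.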

We demonstrate the MAP estimate based on Eq. (10) and Eq. (9). Fig. 1 shows the data points rescaled by a MAP estimate. Here, we assume that a scaling function is linear. To show the flexibility of Bayesian inference, we fix $m_0 = m_1 = 0$. Thus, $\sigma_0$ and $\sigma_1$ are the only free hyper parameters. We artificially generate mock data so that they obey a scaling law:

$$A(T, L) = L^{-\beta/\nu}\, \Psi((T - T_c) L^{1/\nu}), \tag{12}$$



where $T$ and $L$ denote the temperature and linear dimension of a system, respectively. This is a well-known scaling law for finite-size systems. In Fig. 1, we set $T_c = \beta/\nu = 1$, $1/\nu = 2$, and $\Psi(X) = 2 + X$. Then,

$$A(T_i, L_i) = 2 L_i^{-1} + (T_i - 1) L_i + r_i/50, \tag{13}$$

where $r_i$ is a Gaussian noise. These mock data are shown in the inset of the left panel of Fig. 1. The right panel of Fig. 1 shows the maximization of a likelihood, when we start from $T_c = 1/\nu = \beta/\nu = 0$ and $\sigma_0 = \sigma_1 = 2$. The results for $T_c$, $\beta/\nu$, and $1/\nu$ are 1.00745, 0.999008, and 2.00638, respectively. They are close to the correct values.

FIG. 1. (Color on-line) Left panel: The data points rescaled by a MAP estimate. We assume that a scaling function is linear. The results of the MAP estimate for $T_c$, $\beta/\nu$, and $1/\nu$ are 1.00745, 0.999008, and 2.00638, respectively. The dotted (pink) line is the scaling function inferred from the MAP estimate. Inset of left panel: Mock data set. Right panel: Maximization of a likelihood.

Unfortunately, we usually do not know the form of a scaling function a priori. The Bayesian inference based on Eq. (10) and Eq. (9) may not be effective in some cases. Thus, we consider an extension of Eq. (9). From Eq. (10), we may regard data points as obeying a GP. Since the covariance matrix represents statistical correlations in data, we may design it for a wide class of scaling functions. Thus, we introduce a generalized covariance matrix $\Sigma$ as

$$\Sigma = \mathcal{E} + \Sigma', \quad (\Sigma')_{ij} = K(X_i, X_j), \tag{14}$$

where $K(X_i, X_j)$ is called a kernel function. Note that $\Sigma'$ must be positive definite. The Bayesian inference based on Eq. (10) and Eq. (14) is called a GP regression. Eq. (9) is a special case of Eq. (14). As shown in Fig. 1, even if $\vec{\mu} = 0$, the GP regression is successful. For simplicity, we consider only a zero mean vector ($\vec{\mu} = 0$) in this paper.

In the GP regression, we can also infer the scaling function. In fact, we assume that all data points obey a GP. In other words, the joint probability distribution of obtained data points and a new additional point $(X, Y)$ is also a multivariate Gaussian distribution. Therefore, a conditional probability of $Y$ for obtained data can be written by a Gaussian distribution with mean $\mu(X)$ and variance $\sigma^2(X)$:

$$\mu(X) \equiv \vec{k}^{T}\Sigma^{-1}\vec{Y}, \quad \sigma^2(X) \equiv K(X, X) - \vec{k}^{T}\Sigma^{-1}\vec{k}, \tag{15}$$

where $(\vec{k})_i = K(X_i, X)$. We regard $\mu(X)$ in Eq. (15) as a scaling function. For example, the dotted (pink) line in Fig. 1 is $\mu(X)$ in Eq. (15) for mock data with a MAP estimate.

In general, a scaling function is smooth. Since $\mu(X)$ in Eq. (15) is the weighted sum of kernel functions, the kernel function should smoothly decrease for increasing distance between two arguments. In this paper, we propose the use of a Gaussian kernel function (GKF) for the scaling analysis of critical phenomena. GKF is defined as

$$K_G(X_i, X_j) \equiv \theta_0^2 \exp\left(-\frac{(X_i - X_j)^2}{2\theta_1^2}\right), \tag{16}$$

where $\theta_0$ and $\theta_1$ are hyper parameters. Since GKF is smooth and local, the GP regression with GKF may be effective for a wide class of scaling functions.

III. BAYESIAN FINITE-SIZE SCALING
ANALYSIS OF THE TWO-DIMENSIONAL ISING
MODEL
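The inference of a scaling function through Eq. (15), with the covariance of Eq. (14) and a Gaussian kernel, can be sketched as follows. This is a minimal example under stated assumptions: the data are mock points scattered around a hypothetical smooth function (`tanh`), and the hyper parameters `theta0` and `theta1` are fixed by hand rather than obtained from a MAP estimate.

```python
import numpy as np

def gaussian_kernel(Xa, Xb, theta0, theta1):
    """Gaussian kernel theta0^2 exp(-(Xa_i - Xb_j)^2 / (2 theta1^2)) for all pairs."""
    d = Xa[:, None] - Xb[None, :]
    return theta0 ** 2 * np.exp(-d ** 2 / (2.0 * theta1 ** 2))

def gp_predict(X, Y, E, Xnew, theta0, theta1):
    """Mean and variance of Eq. (15) at the points Xnew, with a zero mean vector."""
    Sigma = np.diag(E ** 2) + gaussian_kernel(X, X, theta0, theta1)  # Eq. (14)
    k = gaussian_kernel(X, Xnew, theta0, theta1)   # column j is the vector k for Xnew[j]
    mean = k.T @ np.linalg.solve(Sigma, Y)
    # K(X, X) = theta0^2 for the Gaussian kernel.
    var = theta0 ** 2 - np.sum(k * np.linalg.solve(Sigma, k), axis=0)
    return mean, var

# Mock rescaled data around a hypothetical smooth scaling function.
rng = np.random.default_rng(1)
X = np.linspace(-3.0, 3.0, 40)
E = np.full(X.size, 0.01)
Y = np.tanh(X) + E * rng.normal(size=X.size)

Xnew = np.array([-1.5, 0.25, 2.0])
mean, var = gp_predict(X, Y, E, Xnew, theta0=1.0, theta1=1.0)
print(np.abs(mean - np.tanh(Xnew)).max())  # small: the mean tracks the smooth curve
print(var)                                 # small predictive variances inside the data range
```

No functional form of the curve enters the calculation. Only the kernel, which encodes smoothness, and the data do.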



We demonstrate the GP regression with GKF for the finite-size scaling (FSS) analysis of the two-dimensional Ising model. FSS is widely used in numerical studies of critical phenomena for finite-size systems. It is based on the FSS law derived by the renormalization group argument. The Hamiltonian of the Ising model can be written as

$$H(\{S_i\}) \equiv -J \sum_{\langle ij \rangle} S_i S_j, \tag{17}$$

where $S_i$ is the spin variable ($\pm 1$) of site $i$, $\langle ij \rangle$ denotes the nearest neighbor pairs, and $J$ denotes a positive coupling constant. The partition function can be written as

$$Z = \sum_{\{S_i\}} \exp[-H(\{S_i\})/k_B T], \tag{18}$$

where $k_B$ is the Boltzmann constant. For simplicity, we set $J/k_B = 1$ in the following. The two-dimensional Ising model has a continuous phase transition at a finite temperature. Since there are exact results for the Ising models on square and triangular lattices [5], we can check the results of FSS. To obtain the Binder ratios [6] and magnetic susceptibility on square and triangular lattices, MC simulations have been done. For the square lattice, $L = L_r = 64$, 128, and 256, where $L_r$ and $L$ denote the number of rows and columns of the lattice, respectively. For the triangular lattice, $L = 65 L_r/75 = 65$, 130, and 260 so that the aspect ratio of a triangular lattice is approximately 1. We set periodic boundary conditions for both lattices. The number of MC sweeps by the cluster algorithm [7] is 80000 for each simulation. The Binder ratio is based on the ratio of the fourth and second moments of an order parameter. The order parameter of the Ising model is a magnetization defined as $M \equiv \sum_i S_i$. Then, the Binder ratio can be written as

$$U \equiv \frac{1}{2}\left(3 - \frac{\langle M^4 \rangle}{\langle M^2 \rangle^2}\right), \tag{19}$$
FIG. 2. (Color on-line) The Binder ratios of the Ising model 
on three square lattices. The total number of data items is 
86. Inset: Binder ratio near a critical point. The value of the 
Binder ratio is limited to the region [0.8,0.97]. The number 
of data items in the inset is 24. 



FIG. 3. (Color on-line) Result of a Bayesian FSS of the Binder ratio of the Ising model on square lattices. We apply the GP regression with Eq. (23) to the data shown in Fig. 2. The results of the MC estimate are $1/T_c = 0.440683(7)$ and $1/\nu = 0.996(2)$. The dotted (pink) curve is the scaling function inferred from a MAP estimate.



where $\langle \cdot \rangle$ denotes the canonical ensemble average. In the thermodynamic limit, the Binder ratio takes values 1 and 0 in the ordered and disordered phases, respectively. Since the Binder ratio is dimensionless, the FSS form is

$$U(T, L) = \Psi_B((1/T - 1/T_c) L^{1/\nu}), \tag{20}$$

where $T_c$ is a critical temperature and $\nu$ is a critical exponent that characterizes the divergence of a magnetic correlation length. From Eq. (20), the value of the Binder ratio at the critical temperature is universal. Magnetic susceptibility can be written as

$$\chi \equiv \frac{1}{TV}\left(\langle M^2 \rangle - \langle M \rangle^2\right), \tag{21}$$

where $V$ is the number of spins. The scaling form of magnetic susceptibility is

$$\chi(T, L) = L^{\gamma/\nu}\, \Psi_\chi((1/T - 1/T_c) L^{1/\nu}), \tag{22}$$

where $\gamma$ and $\nu$ are critical exponents.
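The two observables and their rescaled coordinates can be computed from magnetization samples as sketched below. The sketch assumes the common definition $U = (3 - \langle M^4 \rangle / \langle M^2 \rangle^2)/2$ of the Binder ratio, which equals 1 for a fully ordered sample and approaches 0 for Gaussian fluctuations. The function names and the two synthetic samples are illustrative only and do not come from the MC simulations described here.

```python
import numpy as np

def binder_ratio(M):
    """U = (3 - <M^4>/<M^2>^2)/2 from a sample of magnetizations (assumed form)."""
    m2 = np.mean(M ** 2)
    m4 = np.mean(M ** 4)
    return 0.5 * (3.0 - m4 / m2 ** 2)

def susceptibility(M, T, V):
    """chi = (<M^2> - <M>^2)/(T V), with k_B = 1."""
    return (np.mean(M ** 2) - np.mean(M) ** 2) / (T * V)

def fss_coordinates(T, L, value, inv_Tc, inv_nu, gamma_nu=0.0):
    """Rescaled (X, Y): gamma_nu = 0 for the Binder ratio, else for susceptibility."""
    X = (1.0 / T - inv_Tc) * L ** inv_nu
    Y = value * L ** (-gamma_nu)
    return X, Y

# Two limiting synthetic samples (not simulation data).
ordered = np.array([1.0, -1.0] * 500) * 4096.0   # |M| = V in every sample
rng = np.random.default_rng(2)
disordered = rng.normal(size=200000)             # Gaussian fluctuations around zero

print(binder_ratio(ordered))      # ordered limit
print(binder_ratio(disordered))   # disordered limit, close to zero
```

Applying `fss_coordinates` to every measured point with trial values of $1/T_c$, $1/\nu$, and $\gamma/\nu$ yields the $(X_i, Y_i)$ pairs whose collapse the GP regression evaluates.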

We first apply the GP regression to the Binder ratios of square lattices shown in Fig. 2. The kernel function based on GKF can be written as

$$K(X_i, X_j) = K_G(X_i, X_j) + \theta_2^2 \delta_{ij}, \tag{23}$$

where a hyper parameter $\theta_2$ denotes the data fidelity. We note that the maximization of a likelihood is much improved by $\theta_2$. Although $\theta_2$ finally goes to zero, it helps to escape from a local maximum of a likelihood. Fig. 3 shows the result of the GP regression for Binder ratios. The results of the MC estimate are $1/T_c = 0.440683(7)$ and $1/\nu = 0.996(2)$. This is consistent with the exact results $1/T_c = \ln(1 + \sqrt{2})/2 = 0.4406867935\cdots$ and $1/\nu = 1$. The dotted (pink) curve in Fig. 3 is the scaling function inferred from a MAP estimate by using Eq. (15). All points collapse on this curve. The value of the Binder ratio at the critical temperature is 0.9158(4). This is consistent with the exact value $0.916038\cdots$ [8]. It is difficult to represent this curve as a polynomial of low degree. Thus, we limit the value of a Binder ratio to the region [0.8,0.97] (see the inset of Fig. 2). We apply the least-square method with a quadratic function to the limited data. The result is shown in Fig. 4. The inset of Fig. 4 shows the data points rescaled by the best estimate of the least-square method. All points in the inset collapse on a quadratic function (see the dotted (pink) curve in Fig. 4). The reduced chi-square is 2.96. The results of the least-square method are $1/T_c = 0.44069(2)$ and $1/\nu = 1.00(2)$. This is consistent with the exact result. However, it may be difficult to extend the region of data for the least-square method. The main panel of Fig. 4 shows all data points rescaled by the best estimate of the least-square method. While all points again collapse on a smooth curve, the curve is not equal to the quadratic function outside the limited region (see the filled gray (pink) region in Fig. 4). The left panel in Fig. 5 shows the result of the GP regression for the same data as used in the least-square method. The results of the MC estimate are $1/T_c = 0.44070(2)$ and $1/\nu = 1.00(1)$. This is consistent with the exact results and similar to that of the least-square method. The GP regression with GKF assumes only the smoothness of a scaling function. Thus, it may be effective even for the data not near a critical point. In fact, even if we use only data not included in the inset of Fig. 2, we can do FSS by the GP regression. The result is shown in the right panel in Fig. 5. The results of the MC estimate are $1/T_c = 0.440675(9)$ and $1/\nu = 0.997(2)$. Although we do not use the important data near a critical point, the result of the GP regression is close to the exact result.

FIG. 4. (Color on-line) Result of a FSS of the Binder ratio of the Ising model on square lattices by the least-square method. In the least-square method, we use only the data in the inset of Fig. 2 and assume that the scaling function is a quadratic function. The best estimate of the least-square method is $1/T_c = 0.44069(2)$ and $1/\nu = 1.00(2)$. All data points are rescaled by these values. The data points used in the least-square method are shown in the filled gray (pink) region. The dotted (pink) curve is the scaling function inferred from the best estimate of the least-square method. Inset: Rescaled data points used in the least-square method.

FIG. 5. (Color on-line) Left panel: Result of a Bayesian FSS for the data in the inset of Fig. 2. The results of the MC estimate are $1/T_c = 0.44070(2)$ and $1/\nu = 1.00(1)$. Right panel: Result of a Bayesian FSS for data not included in the inset of Fig. 2. The results of the MC estimate are $1/T_c = 0.440675(9)$ and $1/\nu = 0.997(2)$. The dotted (pink) curves in left and right panels are the scaling functions inferred from the MAP estimates.

We also apply the GP regression to the magnetic susceptibility of square lattices. The result is shown in Fig. 6. The results of the MC estimate are $1/T_c = 0.44072(8)$, $1/\nu = 0.98(2)$, and $\gamma/\nu = 1.74(2)$. This is consistent with the exact result ($\gamma/\nu = 7/4 = 1.75$). The dotted (pink) curve is the scaling function inferred from the MAP estimate by using Eq. (15). All points collapse on this curve. However, it is difficult to represent this curve as a polynomial of low degree.

FIG. 6. (Color on-line) Result of a Bayesian FSS of the magnetic susceptibility of the Ising model on square lattices. We apply the GP regression with Eq. (23) to data with the same temperatures and lattice sizes as in Fig. 2. The results of the MC estimate are $1/T_c = 0.44072(8)$, $1/\nu = 0.98(2)$, and $\gamma/\nu = 1.74(2)$. The dotted (pink) curve is the scaling function inferred from a MAP estimate.

FIG. 7. (Color on-line) Result of a Bayesian FSS of the Binder ratio of the Ising model on triangular lattices. We apply the GP regression with Eq. (23). The results of the MC estimate are $1/T_c = 0.274652(7)$ and $1/\nu = 0.989(4)$. The dashed (light-blue) curve is the scaling function of a square lattice in Fig. 3 with a nonuniversal metric factor $C_1 = 1.748(3)$. Inset: Binder ratios of the Ising model on triangular lattices. The number of data items is 86.

Next, we apply the GP regression to the Binder ratio and magnetic susceptibility on triangular lattices. These results are shown in Fig. 7 and Fig. 8, respectively. All points of each quantity collapse on a curve. The results of the MC estimate for $1/T_c$, $1/\nu$, and $\gamma/\nu$ are summarized in Table I. Although they are almost consistent with the exact results, the accuracy of inference is lower than that for the data of square lattices. Since the region of the data of triangular lattices is wide (compare Fig. 7 with Fig. 3), we may need to consider corrections to scaling.

FIG. 8. (Color on-line) Result of a Bayesian FSS of the magnetic susceptibility of the Ising model on triangular lattices. We apply the GP regression with Eq. (23) to the data with the same temperatures and lattice sizes as in Fig. 7. The results of the MC estimate are $1/T_c = 0.27466(7)$, $1/\nu = 0.95(2)$, and $\gamma/\nu = 1.71(2)$. The dashed (light-blue) curve is the scaling function of a square lattice in Fig. 6 with nonuniversal metric factors $C_1 = 1.70(2)$ and $C_2 = 0.777(7)$.

Privman and Fisher proposed the universality of the finite-size scaling function [9]. If two critical systems belong to the same universality class, the two finite-size scaling functions with nonuniversal metric factors are equal as

$$\Psi(X) = C_2 \Psi'(C_1 X), \tag{24}$$

where $\Psi$ and $\Psi'$ are finite-size scaling functions and $C_1$ and $C_2$ are nonuniversal metric factors. Hu et al. checked this idea for bond and site percolation on various lattices [10]. The Ising models on square and triangular lattices belong to the same universality class. Thus, the two scaling functions must be equal via nonuniversal metric factors as in Eq. (24). To check the universality of finite-size scaling functions, we compared the data on triangular lattices with the scaling function inferred from the data on square lattices. We estimated nonuniversal metric factors to minimize the residual between them. The result for the Binder ratio is $C_1 = 1.748(3)$. The results for the magnetic susceptibility are $C_1 = 1.70(2)$ and $C_2 = 0.777(7)$. Note that there is no metric factor $C_2$ for the Binder ratio, because the Binder ratio is dimensionless. The scaling functions of a square lattice with nonuniversal metric factors are shown using the dashed (light-blue) curves in Fig. 7 and Fig. 8. They agree well with the data on triangular lattices. The reduced chi-square of the Binder ratio is 2.65, and that of magnetic susceptibility is 0.36. Therefore, we confirm the universality of finite-size scaling functions for the Binder ratio and magnetic susceptibility of the two-dimensional Ising model. We note that Tomita et al. [11] confirmed the universality of finite-size scaling functions for other quantities, and Mangazeev et al. [2] studied the universality of the scaling function in the thermodynamic limit.
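The estimate of a metric factor by minimizing a residual can be sketched as a one-dimensional minimization. Everything below is an assumption made for illustration: `psi_square` is a hypothetical stand-in for the scaling function inferred from square-lattice data, the mock "triangular" points are noise-free, and a golden-section search replaces whatever minimizer was actually used. Only the factor $C_1$ is fitted, as for a dimensionless quantity such as the Binder ratio.

```python
import math

def psi_square(x):
    # Hypothetical stand-in for a scaling function inferred from square-lattice data.
    return 0.5 * (1.0 + math.tanh(x))

def chi2(c1, data):
    """Weighted residual between the data (X, Y, E) and psi_square(c1 * X)."""
    return sum(((y - psi_square(c1 * x)) / e) ** 2 for x, y, e in data)

def golden_section_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d      # the minimum lies in [a, d]
        else:
            a = c      # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Noise-free mock data generated with a known factor (an assumption for the demo).
c1_true = 1.75
xs = [i / 10.0 - 1.0 for i in range(21)]
data = [(x, psi_square(c1_true * x), 0.01) for x in xs]

c1_fit = golden_section_min(lambda c: chi2(c, data), 0.5, 3.0)
print(c1_fit)   # recovers the factor used to generate the mock data
```

For a quantity that also carries a factor $C_2$, the same residual would be minimized over both factors. Dividing the minimum of `chi2` by the number of degrees of freedom gives a reduced chi-square of the kind quoted above.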



IV. CONCLUSIONS 

In this paper, we introduced a Bayesian framework in the scaling analysis of critical phenomena. This framework includes the least-square method for the scaling analysis as a special case. It can be applied to a wide variety of scaling hypotheses, as shown in Eq. (2). In this framework, we proposed the GP regression with GKF defined by Eqs. (10), (14), and (16). This method assumes only the smoothness of a scaling function and does not require its functional form. We demonstrated it for the FSS of the Ising models on square and triangular lattices. For the data limited to a narrow region near a critical point, the accuracy of the GP regression was comparable to that of the least-square method. In addition, for the data to which we cannot apply the least-square method with a polynomial of low degree, our method worked well. Therefore, we confirm the advantage of the GP regression with GKF for the scaling analysis of critical phenomena.

The GP regression can also infer a scaling function as the mean function $\mu$ in Eq. (15). By comparing the data on triangular lattices with the scaling function inferred from the data on square lattices, we confirmed the universality of the FSS function of the two-dimensional Ising model. The use of the scaling function may help in the determination of a universality class.

In this paper, we assume that the data obey a scaling law. However, in some cases, a part of the data may not obey the scaling law. In such a case, we usually introduce a correction to scaling. If we can assume the form of a correction to scaling, we change only the function $Y$ in Eq. (2). However, the assumption of the correction term may cause a problem. In other words, the identification of a critical region remains an open problem.

As shown in this paper, the GP regression is a powerful method. In particular, the GP regression can be applied to the statistical check for data collapse. For example, we can apply it to the estimate of nonuniversal metric factors in Fig. 7 and Fig. 8. Other interesting applications may be found in the data analysis of physics.
TABLE I. Results of the MC estimates for $1/T_c$, $1/\nu$, and $\gamma/\nu$. The exact values of $1/T_c$ for square and triangular lattices [5] are $\ln(1 + \sqrt{2})/2$ and $(\ln 3)/4$, respectively.

Data                      Lattice      Method          1/T_c         1/ν        γ/ν
Binder ratio              Square       GP regression   0.440683(7)   0.996(2)
Binder ratio              Triangular   GP regression   0.274652(7)   0.989(4)
Binder ratio (a)          Square       Least-square    0.44069(2)    1.00(2)
Binder ratio (a)          Square       GP regression   0.44070(2)    1.00(1)
Binder ratio (b)          Square       GP regression   0.440675(9)   0.997(2)
Magnetic susceptibility   Square       GP regression   0.44072(8)    0.98(2)    1.74(2)
Magnetic susceptibility   Triangular   GP regression   0.27466(7)    0.95(2)    1.71(2)

(a) Data in the inset of Fig. 2.
(b) Data not included in the inset of Fig. 2.



by the National Science Foundation under Grant No. PHY05-51164. 



[1] N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group (Westview Press, 1992); J. Cardy, Scaling and Renormalization in Statistical Physics (Cambridge University Press, 1996).
[2] V. V. Mangazeev, M. Y. Dudalev, V. V. Bazhanov, and M. T. Batchelor, Phys. Rev. E 81, 060103 (2010).
[3] K. Slevin and T. Ohtsuki, Phys. Rev. Lett. 82, 382 (1999).
[4] C. M. Bishop, Pattern Recognition and Machine Learning (Springer, 2006).
[5] L. Onsager, Phys. Rev. 65, 117 (1944); C. N. Yang, Phys. Rev. 85, 808 (1952).
[6] K. Binder, Z. Phys. B 43, 119 (1981).
[7] R. H. Swendsen and J. S. Wang, Phys. Rev. Lett. 58, 86 (1987).
[8] J. Salas and A. D. Sokal, J. Stat. Phys. 98, 551 (2000).
[9] V. Privman and M. E. Fisher, Phys. Rev. B 30, 322 (1984).
[10] C.-K. Hu, C.-Y. Lin, and J.-A. Chen, Phys. Rev. Lett. 75, 193 (1995).
[11] Y. Tomita, Y. Okabe, and C.-K. Hu, Phys. Rev. E 60, 2716 (1999).



